Discovering Test Set Regularities in Relational Domains
Abstract
Machine learning typically involves discovering regularities in a training set, then applying these learned regularities to classify objects in a test set. In this paper we present an approach to discovering additional regularities in the test set, and show that in relational domains such test set regularities can be used to improve classification accuracy beyond that achieved using the training set alone. For example, we have previously shown how FOIL, a relational learner, can learn to classify Web pages by discovering training set regularities in the words occurring on target pages, and on other pages related by hyperlinks. Here we show how the classification accuracy of FOIL on this task can be improved by discovering additional regularities on the test set pages that must be classified. Our approach can be seen as an extension to Kleinberg’s Hubs and Authorities algorithm that analyzes hyperlink relations among Web pages. We present evidence that this new algorithm leads to better test set precision and recall on three binary Web classification tasks where the test set Web pages are taken from different Web sites than the training set.
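For orientation, the baseline that the abstract refers to, Kleinberg's Hubs and Authorities iteration, can be sketched as follows. This is only a minimal illustration of the standard algorithm, not the extension proposed in the paper; the function names `hits` and `normalize`, the dictionary-based link representation, and the fixed iteration count are all assumptions made for this sketch.

```python
from math import sqrt


def normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all are zero)."""
    norm = sqrt(sum(v * v for v in scores.values()))
    if norm == 0.0:
        return dict(scores)
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=50):
    """Standard Hubs and Authorities iteration (hypothetical helper).

    links maps each page to the pages it links to. Returns (hub, auth)
    score dictionaries covering every page that appears in links.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}

    for _ in range(iterations):
        # Authority score: sum of hub scores of the pages linking in.
        new_auth = {page: 0.0 for page in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]

        # Hub score: sum of authority scores of the pages linked to.
        new_hub = {
            page: sum(new_auth[dst] for dst in links.get(page, ()))
            for page in pages
        }

        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return hub, auth


# Toy graph: pages "a" and "b" both link to page "c".
hub, auth = hits({"a": ["c"], "b": ["c"], "c": []})
```

In the toy graph, page "c" ends up as the only authority and pages "a" and "b" share the hub weight equally, which is the mutual-reinforcement behavior that a hyperlink-based extension would build on.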
Similar resources
Discovering Domains Mediating Protein Interactions
Background: Protein-protein interactions do not provide any direct information regarding the domains within the proteins that mediate the interactions. The majority of proteins are multi domain proteins and the interaction between them is often defined by the pairs of their domains. Most of the former studies focus only on interacting domain pairs. However they do not consider the in...
An Effective Algorithm for Discovering Fuzzy Rules in Relational Databases
In this paper, we present a novel technique, called F-APACS, for discovering fuzzy association rules in relational databases. Instead of dividing up quantitative attributes into fixed intervals and searching for rules expressed in terms of them, F-APACS employs linguistic terms to represent the revealed regularities and exceptions. The definitions of these linguistic terms are based on fuzzy se...
Discovering regularities from knowledge bases
Knowledge bases open new horizons for machine learning research. One challenge is to design learning programs to expand the knowledge base using the knowledge that is currently available. This paper addresses the problem of discovering regularities in large knowledge bases that contain many assertions in different domains. The paper begins with a definition of regularities and gives the motivatio...
Inductive Logic Programming for Discovering Financial Regularities
The purpose of this work is discovering regularities in financial time series using Inductive Logic Programming (ILP) and related "Discovery" software system [Vityaev et al., 1992,1993] in data mining. Discovered regularities were used for forecasting the target variable, representing the relative difference in percent between today's closing price and the price five days ahead. We describe the...
Discovering Regularities in Databases Using Canonical Decomposition of Binary Relations
Regularities in databases are directly useful for knowledge discovery and data summarization. As a mathematical background, relational algebra helped for discovering the main data structures and existing dependencies between the different attributes in a relational database. Functional, difunctional and other kinds of dependencies in a relational database describe invariant regular structures t...